A Survey on Hate Speech Detection using Natural Language Processing
Authors
Abstract
This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss limits of those approaches.
Related papers
Automatic Detection of Online Jihadist Hate Speech
We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy, by using techniques from Natural Language Processing and Machine Learning. The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016. We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine...
Automated Hate Speech Detection and the Problem of Offensive Language
A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced...
Deep Learning for Hate Speech Detection in Tweets
Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis. We define this task as being able to classify a tweet as racist, sexist or neither. The complexity of the natural language constructs makes this task very challenging. We perform extensive experiments with multiple deep learn...
Hate Speech Detection with Comment Embeddings
We address the problem of hate speech detection in online user comments. Hate speech, defined as an “abusive speech targeting specific group characteristics, such as ethnicity, religion, or gender”, is an important problem plaguing websites that allow users to leave feedback, having a negative impact on their online business and overall user experience. We propose to learn distributed low-dimen...
Recognizing Explicit and Implicit Hate Speech Using a Weakly Supervised Two-path Bootstrapping Approach
In the wake of a polarizing election, social media is laden with hateful content. To address various limitations of supervised hate speech classification methods including corpus bias and huge cost of annotation, we propose a weakly supervised two-path bootstrapping approach for an online hate speech detection model leveraging large-scale unlabeled data. This system significantly outperforms hat...